perm filename THESIS.DOC[2,TES] blob
sn#009908 filedate 1972-07-14 generic text, type T, neo UTF8
*** BLANK PAGE ***
STANFORD ARTIFICIAL INTELLIGENCE PROJECT JULY, 1972
MEMO AIM-
REPRESENTATION AND DESCRIPTION OF CURVED OBJECTS
by
Gerald Jacob Agin
ABSTRACT
A representation is proposed in which three-dimensional
objects are represented by data structures composed of
primitives called generalized cylinders. These primitives
consist of a space curve or "skeleton" and a cross section
which may vary along the length of the skeleton. Apparatus
and programs are described which obtain depth information by
scanning objects with a laser and television camera.
Results are presented from a set of programs which analyze
the laser-derived depth information and segment objects into
primitives describable as generalized cylinders. Methods
are proposed whereby a program may generate complete
descriptions of complex curved objects.
*** WORKING DRAFT -- July 14, 1972 ***
The research reported here was supported in part by the
Advanced Research Projects Agency.
The views and conclusions contained in this document are
those of the author and should not be interpreted as
necessarily representing the official policies, either
expressed or implied, of the Advanced Research Projects
Agency or of the U. S. Government.
Reproduced in the USA. Available from the National
--------- ---- --- --------
Technical Information Service, Springfield, Virginia 22151.
--------- ----------- ------- ----------- -------- -----
Price: full size copy $3.00; microfiche copy $0.95.
----- ---- ---- ---- ----- ---------- ---- -----
PREFACE TO THE WORKING DRAFT
This is the working draft of "Representation and Description
of Curved Three-Dimensional Objects. Copies of this
document are for review and comments only.
Present status:
Section needs writing.
Section and Section 5 need extensive revisions.
Update Section to reflect new calibration procedure.
Some figures remain to be drawn.
Fix up some of the references.
Some further research on fitting cross sections,
describing skeletons, and describing complex objects, if
time permits.
Comments and criticisms are invited.
Jerry Agin
Page iii
TABLE OF CONTENTS
1 INTRODUCTION Page 1
REFERENCES Page 5
Page iv
LIST OF ILLUSTRATIONS
Page 1
1 INTRODUCTION
My present interest in representation and description of
curved objects arose from a desire to extend the
capabilities of the Stanford Hand-Eye System [Feldman] to
recognize a wider class of objects than plane-bounded
solids. Initial attempts to recognize geometric cones,
cylinders, and spheres were not carried far enough to
demonstrate the usefulness of existing techniques in
recognizing this limited addition to the class of
recognizable objects. But there appears to be no
insurmountable barrier to doing so.
It soon became apparent that little useful purpose would be
served by this limited extension. A significant improvement
in performance of vision systems would come only when they
were capable of recognizing the sort of everyday objects
that a robot of the future might have to deal with. To
accomplish this, we make use of two new tools or techniques.
The first of these is a representation general and flexible
enough to carry varied information about an object, its
parts, and their relation to one another. The
representation we propose will allow several different
1.0 INTRODUCTION Page 2
models for primitives or parts of objects. These models may
include prototypes of various sorts. The particular model
we present in this paper, generalized cylinders, is useful
for describing a large class of natural and man-made
objects. It describes in a natural and intuitive way pieces
which possess elongation, or which have axial symmetry. The
generalized cylinders may be linked together in various ways
to form complex objects. The most significant departure
from previous methods of description is that the
representation is essentially a volume representation, as
opposed to a surface representation. Section 2 of this
report describes in detail representation by generalized
cylinders, and gives some examples of how objects might be
modelled with such a representation.
The second new technique is the use of direct depth
measurement for recognition. Every two-dimensional image of
a three-dimensional scene is ambiguous, in that there are an
infinite number of realizable physical objects (or groupings
of objects) which could give rise to the image. These
ambiguities may be resolved only by a priori assumptions
about the nature of the scene. The use of direct depth
information eliminates the need for assumptions such as
squareness of corners, requirements that stored prototypes
1.0 INTRODUCTION Page 3
exist for every possible object in a scene, or knowledge of
the reflective characteristics of surfaces. Section 3
describes a ranging system, which obtains depth information
by triangulation, using a laser and a television camera.
The test of any vision system must be how well it deals with
actual objects. Sections 4 and 5 describe an experimental
system to generate descriptions of physical objects, from
depth data derived from the ranging system. The system has
so far given good results in describing some simple curved
objects, and has been moderately successful in describing
parts of complex objects. Some results may be seen in
Figures ___. Section 6 contains some suggestions for
further research.
So far we have not achieved our goal of recognition of
complex objects; we cannot yet identify a given object, say,
as a humanoid figure or as a hammer. But we hope our method
of representation and our work in description of objects has
laid the groundwork for progress in this area.
I would like to express my appreciation for the guidance and
assistance of my advisor, Dr. Thomas O. Binford. His
insight has been helpful on many occasions. Many of the
1.0 INTRODUCTION Page 4
original ideas on which this research is based were his,
including generalized translational invariance, and the
basic configuration of a laser ranging system.
Thanks are also due to Dr. Arthur L. Schawlow for the
generous loan of a laser, to Victor Scheinman for the design
and assembly of the laser deflection apparatus, to Richard
Underwood for routines for the numerical solution of the
generalized eigenvalue equation, and to Bruce G. Baumgart,
R. K. Nevatia, and J. M. Tenenbaum for their helpful
comments and suggestions.
Page 5
REFERENCES
[Baumgart 72a] GEOMED ...
[Baumgart 72b] On the Representation of Physical Objects
__ ___ ______________ __ ________ _______
...
[Blum] Harry Blum, "A Transformation for Extracting New
Descriptors of Shape", Symposium on Models for
Perception of Speech and Visual Form, Boston,
November 11-14, 1964.
[Binford 70] Thomas O. Binford, "Triangulation by Laser",
December, 1970, unpublished.
[Binford 71] Thomas O. Binford, "Visual Perception by
Computer", presented at ...
[Coons] S. A. Coons and B. Herzog, "Surfaces for Computer-
Aided Aircraft Design", J. Aircraft, Vol 1, No. 4
__ ________
(July-Aug, 1968), pp 402-406.
[Courant] Differential and Integral Calculus, Interscience,
____________ ___ ________ ________
1936, Volume 2, pp. 190-199.
2.0 REFERENCES Page 6
[Coxeter] H. S. M. Coxeter, Introduction to Geometry, John
____________ __ ________
Wiley and Sons, 1961, pp321-326.
[DeBoor] Carl de Boor and John R. Rice, Least Squares Cubic
_____ _______ _____
Spline Approximation I - Fixed Knots, Purdue
______ _____________ _ _ _____ _____
University Report No. CSD TR 20, April 1968. Least
_____
Squares Cubic Spline Approximation II - Variable
_______ _____ ______ _____________ __ _ ________
Knots, Purdue University Report No. CSD TR 21, April
_____
1968.
[Earnest] Lester D. Earnest, Choosing an Eye for a
________ __ ___ ___ _
Computer, Stanford Artificial Intelligence Project
Memo AIM-51, April, 1967.
[Falk] ...
[Feldman] J. Feldman, K. Pingle, T Binford, G Falk, A. Kay,
R. Paul, R. Sproull, and J. Tenenbaum, "The Use of
Vision and Manipulation to Solve the `Instant
Insanity' Puzzle", Second International Joint
Conference on Artificial Intelligence, London,
September 1-3, 1971.
[Horn] Berthold Klaus Paul Horn, Shape from Shading: A
_____ ____ ________ _
2.0 REFERENCES Page 7
Method for Finding the Shape of a Smooth Opaque
______ ___ _______ ___ _____ __ _ ______ ______
Object from One View, Ph.D. Thesis, Massachusetts
______ ____ ___ ____
Institute of Technology, June, 1970.
[Krakaur] ...
[Mott-Smith] John Mott-Smith, unpublished ...
[Pingle] Karl K. Pingle, Hand/Eye Library, Stanford
________ _______
Artificial Intelligence Laboratory Operating Note
35.1, January, 1972.
[Roberts 63] L. G. Roberts, Machine Perception of Three-
_______ __________ __ ______
Dimensional Solids ...
___________ ______
[Roberts 65] L. G. Roberts, Homogeneous Matrix
___________ ______
Representation and Manipulation of N-Dimensional
______________ ___ ____________ __ _____________
Constructs, Document MS1045, Lincoln Laboratory,
__________
Massachusetts Institute of Technology, May, 1965.
[Shirai] Yoshiaki Shirai and Motoi Suwa, "Recognition of
Polyhedrons with a Range Finder", Second
International Joint Conference on Artificial
Intelligence, London, September 1-3, 1971.
2.0 REFERENCES Page 8
[Smith] Lyle B. Smith, The Use of Man-Machine Interaction
___ ___ __ ___________ ___________
in Data-Fitting Problems, Stanford Linear
__ ____________ ________
Accelerator Center Report No. 96, March, 1969.
[Sobel] Irwin Sobel, Camera Models and Machine Perception,
______ ______ ___ _______ __________
Stanford Artificial Intelligence Project Memo AIM-
121, May, 1970.
[Will] P. M. Will and K. S. Pennington, "Grid Coding: A
Preprocessing Technique for Robot and Machine
Vision", Second International Joint Conference on
Artificial Intelligence, London, September 1-3,
1971.